# IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY

## A COMPARATIVE STUDY OF CLOCK GATED ETCAM WITH ZTCAM

Ms. Jenifer Anna Jose<sup>\*</sup>, Ms. Jency Andrews

\* Department of Electronics and Communication, MLMCE, Kerala, INDIA

#### ABSTRACT

Ternary content addressable memory (TCAM) is a kind of memory that is searched by content instead of by address. It implements high-speed lookup operations within a single clock cycle. However, compared with RAM technology, conventional TCAM circuitry has certain limitations, such as longer access time, lower storage capacity, lower circuit density, and higher cost. This brief proposes a novel memory architecture, named clock gated Z-TCAM, which emulates TCAM functionality with SRAM. The conventional Z-TCAM architecture divides the classical TCAM table along its columns and rows into hybrid TCAM sub-tables, which are then processed and mapped onto their respective memory blocks. The proposed architecture offers better search performance, scalability, and a better power delay product than the conventional ZTCAM. Such SRAM-based TCAM structures can therefore be used in networking applications such as packet switching and packet classification in communication systems.

**Key words**: TCAM: ternary content addressable memory; ETCAM: evolved TCAM; SRAM: static random access memory

#### **INTRODUCTION**

Ternary content addressable memory (TCAM) is an outgrowth of random access memory (RAM), but unlike RAM, TCAM provides access to stored data by content rather than by address and outputs the match address [1]. Since TCAM can store a don't care state (x), which matches both 0 and 1 during a comparison operation, multiple matches may occur. A typical CAM compares the search key with all the stored words in parallel and returns the address of the best match. Because TCAM provides a high-speed parallel search operation, it has a wide range of applications, including network routers, translation look-aside buffers in microprocessors, data compression, real-time pattern matching in virus-detection and intrusion-detection systems, gene pattern searching in bioinformatics, and image processing.
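The matching rule described above can be sketched in software. This is a minimal behavioural model, not a description of any TCAM circuit: the function names `ternary_match` and `tcam_search` are hypothetical, the stored words are written as strings over `0`, `1`, and `x`, and "best match" is assumed here to mean the lowest matching address.

```python
from typing import List, Optional


def ternary_match(key: str, word: str) -> bool:
    """Return True if a binary key matches a ternary word ('0', '1', 'x')."""
    if len(key) != len(word):
        raise ValueError("key and stored word must have the same width")
    # An 'x' in the stored word matches either key bit.
    return all(w == "x" or w == k for k, w in zip(key, word))


def tcam_search(table: List[str], key: str) -> Optional[int]:
    """Compare the key with every stored word and return the match address.

    Assumption: when several words match, the lowest address wins.
    A hardware TCAM does the comparisons in parallel; this loop is
    only a sequential model of the same result.
    """
    matches = [addr for addr, word in enumerate(table) if ternary_match(key, word)]
    return matches[0] if matches else None


table = ["10x1", "1xx1", "0000"]
print(tcam_search(table, "1011"))  # 0 (addresses 0 and 1 both match)
print(tcam_search(table, "1101"))  # 1
print(tcam_search(table, "0110"))  # None (mismatch)
```

The first query shows why a priority rule is needed: the key `1011` matches two stored words because of the don't care bits.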

The primary application of TCAM is in network systems, where the destination address of an incoming packet is compared against the stored addresses and the packet is forwarded to the appropriate output port. Although CAM technology offers the major advantage of single-clock-cycle comparison over standard RAM, it also has shortcomings. TCAM is not subject to the intense commercial competition found in the RAM market and has yet to gain a substantial market share. TCAM costs about 30 times more per bit of storage than SRAM. To provide a parallel search operation, TCAM needs comparison circuitry in each cell, so CAM density lags RAM density. The comparison circuitry in each cell not only makes TCAM expensive but also adds complexity to the TCAM architecture. The extra logic and capacitive loading due to the massive parallelism lengthen the access time of TCAMs, which is over 3.3 times longer than the SRAM access time [2,3].

The proposed E-TCAM may be used in networking systems where large amounts of data need to be compared in parallel at high speed. Currently, TCAMs are used in networking systems, but they are expensive and do not scale as well as SRAMs with respect to clock rate or circuit area. Thus, SRAM- and FPGA-based TCAMs can be used in networking chips to achieve high speed and high throughput.

E-TCAM offers better scalability and lower cost than classical TCAM devices, given that SRAM devices are denser, cheaper, and faster than TCAM devices. E-TCAM also achieves deterministic lookup throughput.

#### A BRIEF ABOUT CONVENTIONAL ETCAM

The overall architecture of E-TCAM is illustrated in Fig. 1; each layer embodies the architecture shown in Fig. 2. The design has L layers and a CAM priority encoder (CPE). Each layer, after processing, outputs a potential match address (PMA). The PMAs are fed to the CPE, which selects the match address (MA) among the PMAs.
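The role of the CPE can be illustrated with a short sketch. The text does not state the priority order or how a PMA maps back to an address in the full table, so two assumptions are made here for illustration only: layer 0 has the highest priority, and each layer holds `k` consecutive original addresses. The function name `cam_priority_encoder` is hypothetical.

```python
from typing import List, Optional


def cam_priority_encoder(pmas: List[Optional[int]], k: int) -> Optional[int]:
    """Select the match address (MA) among the per-layer PMAs.

    pmas[l] is the potential match address inside layer l, or None if
    that layer reported a mismatch.
    Assumptions (not from the paper): layer 0 has the highest priority,
    and layer l holds original addresses l*k .. l*k + k - 1.
    """
    for layer, pma in enumerate(pmas):
        if pma is not None:
            return layer * k + pma
    return None  # every layer mismatched


# Three layers with k = 2 addresses each: layer 0 mismatches,
# layers 1 and 2 both report a PMA, and layer 1 wins.
print(cam_priority_encoder([None, 1, 0], k=2))  # 3
print(cam_priority_encoder([None, None, None], k=2))  # None
```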

http://www.ijesrt.com © International Journal of Engineering Sciences & Research Technology



Figure 1 OVERALL ARCHITECTURE OF ETCAM[1]

## LAYER ARCHITECTURE

The layer architecture is shown in Fig. 2. Each layer has N validation memories (VMs), a 1-bit AND operation, N original address table address memories (OATAMs), N original address tables (OATs), a K-bit AND operation, and a layer priority encoder (LPE).



Figure 2.Layer Architecture of ETCAM

#### 1) Validation Memory:

The size of each VM is $2^w \times 1$ bits, where $w$ is the number of bits in each subword and $2^w$ is the number of rows. A subword of $w$ bits has $2^w$ possible combinations, each of which represents a subword [1]. Each subword acts as an address to the VM. If the memory location addressed by a subword is high, the input subword is present; otherwise it is absent. Thus, the VM validates the input subword if it is present.
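As an illustration, a validation memory can be modelled as a list of $2^w$ bits indexed by the subword value. The sketch below assumes that a stored subword containing don't care bits sets every location it covers; this expansion step and the helper names `expand` and `build_vm` are assumptions made for the example.

```python
from itertools import product
from typing import List


def expand(subword: str) -> List[int]:
    """Enumerate every binary value covered by a ternary subword."""
    choices = [("0", "1") if c == "x" else (c,) for c in subword]
    return [int("".join(bits), 2) for bits in product(*choices)]


def build_vm(subwords: List[str], w: int) -> List[int]:
    """Build a 2**w x 1 validation memory from the stored subwords."""
    vm = [0] * (2 ** w)
    for s in subwords:
        if len(s) != w:
            raise ValueError("every subword must be w symbols wide")
        for value in expand(s):
            vm[value] = 1  # this subword value is present
    return vm


# w = 2: '10' covers address 2, 'x1' covers addresses 1 and 3.
vm = build_vm(["10", "x1"], w=2)
print(vm)             # [0, 1, 1, 1]
print(vm[0b00] == 1)  # False: subword 00 is absent
```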

#### 2) 1-bit AND Operation:

It ANDs the outputs of all VMs. The output of the 1-bit AND operation decides whether a search operation continues: if the result is high, the search continues; otherwise a mismatch occurs in the corresponding layer [1].

#### 3) Original Address Table:

The dimensions of the OAT are $2^w \times K$, where $w$ is the number of bits in a subword, $2^w$ is the number of rows, and $K$ is the number of bits in each row, with each bit representing an original address. Here the $K$ bits correspond to a subset of the original addresses from the conventional TCAM table. It is the OAT that stores the original addresses [1].


#### 4) K-bit AND Operation and Layer Priority Encoder:

It ANDs, bit by bit, the *K*-bit rows read out from all OATs and forwards the result to the LPE. Because we emulate TCAM and multiple matches may occur in TCAM, the LPE selects the PMA among the outputs of the *K*-bit AND operation [1].
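A small sketch of these two blocks follows. Each *K*-bit row is represented as an integer in which bit `i` stands for original address `i` of the layer. That bit ordering, the rule that the lowest address wins, and the function names are assumptions made for illustration.

```python
from functools import reduce
from typing import List, Optional


def k_bit_and(rows: List[int]) -> int:
    """Bitwise AND of the K-bit rows read from all OATs (rows must be non-empty)."""
    return reduce(lambda a, b: a & b, rows)


def layer_priority_encoder(vector: int, k: int) -> Optional[int]:
    """Return the PMA, or None if no bit is set (mismatch).

    Assumption: bit i represents original address i, lowest address wins.
    """
    for addr in range(k):
        if (vector >> addr) & 1:
            return addr
    return None


# K = 4, three OATs: only addresses 1 and 2 survive the AND.
rows = [0b0110, 0b0111, 0b1110]
result = k_bit_and(rows)
print(bin(result))                          # 0b110
print(layer_priority_encoder(result, k=4))  # 1
```

An address survives the AND only if every subword of the search key is consistent with the word stored at that address, which is why the per-column rows can be combined this way.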

| <u>STEP</u> | ACTIVITY |
|-------------|----------|
| 1 | The subwords are selected. |
| 2 | The subwords are applied simultaneously to their validation memories. |
| 3 | All the VMs are read concurrently. |
| 4 | If all VMs validate their subwords, the search operation continues. |
| 5 | The OAT module is activated; only the address location activated from the VM module is read out. |
| 6 | The *K*-bit AND operation is performed. |
| 7 | The result is given to the LPE, which selects the PMA or signals a mismatch. |

 Table 1: Flow of Search Operation in LAYER of ZTCAM

Table 1 describes searching in a layer of E-TCAM. *N* subwords are simultaneously applied to a layer. The subwords then read out the corresponding memory locations from their respective validation memories. If all VMs validate their corresponding subwords (equivalent to the 1-bit AND operation), an activation signal is generated, which enables the OAT module; otherwise a mismatch occurs in the layer. Upon validation of all subwords, the subwords concurrently read out their respective memory locations from the respective OATAMs, which output the corresponding original address table addresses (OATAs). All OATAs then read out *K*-bit rows from their corresponding OATs simultaneously, and these rows are bitwise ANDed. The LPE selects the PMA from the result of the *K*-bit AND operation.
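The seven steps of Table 1 can be tied together in a behavioural sketch of one layer. This is a software model under stated assumptions, not the authors' Verilog design: the class name `Layer`, the way the memories are filled from the ternary words, the dictionary used for each OATAM, and the lowest-address-wins priority are all illustrative choices.

```python
from itertools import product
from typing import Dict, List, Optional


def expand(subword: str) -> List[int]:
    """Enumerate every binary value covered by a ternary subword."""
    choices = [("0", "1") if c == "x" else (c,) for c in subword]
    return [int("".join(bits), 2) for bits in product(*choices)]


class Layer:
    """One layer holding K ternary words, each split into N subwords of w bits."""

    def __init__(self, words: List[str], n: int, w: int) -> None:
        if any(len(word) != n * w for word in words):
            raise ValueError("every word must be n * w symbols wide")
        self.n, self.w, self.k = n, w, len(words)
        self.vm = [[0] * (2 ** w) for _ in range(n)]           # N validation memories
        self.oatam: List[Dict[int, int]] = [{} for _ in range(n)]  # subword -> OAT address
        self.oat: List[List[int]] = [[] for _ in range(n)]     # K-bit rows
        for col in range(n):
            rows = [0] * (2 ** w)  # K-bit vector for each subword value
            for addr, word in enumerate(words):
                for value in expand(word[col * w:(col + 1) * w]):
                    rows[value] |= 1 << addr
            for value, row in enumerate(rows):
                if row:  # subword value is present in this column
                    self.vm[col][value] = 1
                    self.oatam[col][value] = len(self.oat[col])
                    self.oat[col].append(row)

    def search(self, key: str) -> Optional[int]:
        # Steps 1-2: select the N subwords and apply them to the VMs.
        subs = [int(key[c * self.w:(c + 1) * self.w], 2) for c in range(self.n)]
        # Steps 3-4: read all VMs; the 1-bit AND decides whether to continue.
        if not all(self.vm[c][s] for c, s in enumerate(subs)):
            return None
        # Steps 5-6: read the OAT rows through the OATAMs and AND them.
        result = (1 << self.k) - 1
        for c, s in enumerate(subs):
            result &= self.oat[c][self.oatam[c][s]]
        # Step 7: LPE selects the PMA (assumed: lowest address wins).
        for addr in range(self.k):
            if (result >> addr) & 1:
                return addr
        return None


layer = Layer(["10x1", "1xx1"], n=2, w=2)
print(layer.search("1011"))  # 0
print(layer.search("1101"))  # 1
print(layer.search("0011"))  # None (rejected by a VM)
```

Note that passing the VM check does not guarantee a match: if each subword is present but at different addresses, the *K*-bit AND is zero and the layer still reports a mismatch.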

#### Proposed System: CLOCK GATED Z-TCAM

The conventional design can be made more efficient, because it does not use any method to control when the blocks inside a layer are initialized. Once the input is received, all the internal blocks are initialized and then wait for the output of the preceding stage before they can operate. This leads to large static power dissipation, which is undesirable in low-power design. The design of the clock gated ZTCAM is the same as the previous design, with an additional clock gating module.

In this technique, the modification is made both in the overall architecture and inside the layer architecture.



Figure 3. Clock Gating Module

The working of the clock gating module is described in the table below.

| MODE       | INPUTS         | OUTPUTS                     |
|------------|----------------|-----------------------------|
| Read mode  | WR=1           | SEL_LINE=1, CLOCK_OUT=0     |
| Write mode | WR=0, ENABLE=1 | SEL_LINE=0, CLOCK_OUT=CLOCK |
| Write mode | WR=0, ENABLE=0 | SEL_LINE=1, CLOCK_OUT=0     |
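Read as a truth table, the behaviour above can be written as a small function. This is only a sketch of the table as printed: the function name `clock_gate` is hypothetical, and because the table gives no ENABLE value for WR=1, ENABLE is treated as a don't care in that case.

```python
from typing import Tuple


def clock_gate(wr: int, enable: int, clock: int) -> Tuple[int, int]:
    """Return (SEL_LINE, CLOCK_OUT) for the given inputs, following the table.

    Assumption: for WR=1 the ENABLE input is ignored, since the table
    lists no ENABLE value for that row.
    """
    if wr == 0 and enable == 1:
        return 0, clock  # SEL_LINE=0, the clock passes through
    return 1, 0          # SEL_LINE=1, the clock is blocked


for wr, enable in [(1, 0), (1, 1), (0, 1), (0, 0)]:
    for clock in (0, 1):
        sel, out = clock_gate(wr, enable, clock)
        print(f"WR={wr} ENABLE={enable} CLOCK={clock} -> SEL_LINE={sel} CLOCK_OUT={out}")
```

Only one input combination lets the clock through, so every block fed by `CLOCK_OUT` stays idle in the other cases.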

### OVERALL ARCHITECTURE OF CLOCK GATED ETCAM

In the overall architecture, the input is given to a layer only if the clock is present and the enable pin is high. The internal blocks then operate and the PMAs are generated. Only once the PMAs are generated is the CAM priority encoder initialized; until then, that block is not made to work. Fig. 4 depicts the overall architecture of the clock gated ETCAM.



Figure 4. Clock Gated ZTCAM Architecture

## **IMPLEMENTATION AND RESULTS**

The Z-TCAM architecture was implemented in Verilog HDL with a Xilinx Spartan (xc7va100t-3csg324) FPGA as the target, using the Xilinx 13.2 synthesis and implementation tool. Its functionality was verified with different test vectors in the Xilinx ISim simulator. The implementation followed the steps explained above. The resource utilization and maximum frequency of the two designs were observed and compared with the existing architecture.

| Supply Source | Voltage (V) | Total Current (A) | Dynamic Current (A) | Quiescent Current (A) |
|---------------|-------------|-------------------|---------------------|-----------------------|
| Vccint        | 1.200       | 0.003             | 0.001               | 0.002                 |
| Vccaux        | 2.500       | 0.003             | 0.000               | 0.003                 |
| Vcco25        | 2.500       | 0.000             | 0.000               | 0.000                 |

|                  | Total | Dynamic | Quiescent |
|------------------|-------|---------|-----------|
| Supply Power (W) | 0.012 | 0.001   | 0.010     |

Figure 5. Power Consumption Of Clock Gated ZTCAM Using XPower Analyzer




Figure 6. Waveform Of Clock Gated ETCAM

## **COMPARISON OF RESULTS**

| <u>Architecture</u> (parameters L=2, K=2, N=4) | <u>No. of Flip-flops</u> | <u>No. of LUTs</u> | Power Delay Product |
|------------------------------------------------|--------------------------|--------------------|---------------------|
| CLOCK GATED ZTCAM                              | 139                      | 298                | 0.029               |
| CLOCK GATED ETCAM                              | 124                      | 281                | 0.017               |
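Taking the table values at face value, the relative difference of the second row with respect to the first can be worked out directly. The arithmetic below uses only the numbers in the table; the units of the power delay product are not stated in the table, so the ratios are the only unit-free quantities that can be derived.

$$
\frac{0.029 - 0.017}{0.029} \approx 0.414, \qquad
\frac{139 - 124}{139} \approx 0.108, \qquad
\frac{298 - 281}{298} \approx 0.057
$$

That is, the second row shows roughly a 41% lower power delay product, about 11% fewer flip-flops, and about 6% fewer LUTs than the first row.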

### **CONCLUSION**

RAM technology is more attractive than CAM technology for several reasons, so SRAM can be configured to behave like TCAM. Here the TCAM functionality is emulated with SRAM. The memory architecture is based on the hybrid partitioning concept, which divides the conventional TCAM table along its rows and columns into sub-tables that are then stored in their corresponding SRAM units. The memory architecture utilizes the benefits of RAM technology along with the feasibility of FPGA. The output is the match address (MA), which represents the original address at which the search word is stored.

The proposed E-TCAM is simpler and easily scalable, owing to the easy scalability of SRAM for large TCAM sizes. It can be readily implemented in an ASIC or FPGA environment.

The conventional TCAMs in packet forwarding engines, pattern matching systems, and tag matching in caches can be replaced by SRAM-based TCAMs, which use the benefits of RAM technology and the feasibility of FPGA technology.

#### REFERENCES

- 1. Z. Ullah, M. K. Jaiswal, and R. C. C. Cheung, "Z-TCAM: An SRAM-based architecture for TCAM," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 2, pp. 402–406, 2015.
- 2. W. Jiang and V. K. Prasanna, "Large-scale wire-speed packet classification on FPGAs," in Proc. ACM/SIGDA Int. Symp. Field Program. Gate Arrays, 2009, pp. 219–228.
- 3. W. Jiang and V. Prasanna, "Parallel IP lookup using multiple SRAM based pipelines," in Proc. IEEE Int. Symp. Parallel Distrib. Process., Apr. 2008, pp. 1–14.
- 4. S. Dharmapurikar, P. Krishnamurthy, and D. Taylor, "Longest prefix matching using bloom filters," IEEE/ACM Trans. Netw., Apr. 2006.
- 5. https://supportforums.cisco.com/document/60831/cam-content-addressable-memory-vs-tcam-ternary
- 6. W. Jiang and V. Prasanna, "Scalable packet classification on FPGA," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 9, pp. 1668–1680, Sep. 2012.
- 7. K. Pagiamtzis and A. Sheikholeslami, "Content-addressable memory (CAM) circuits and architectures: A tutorial and survey," IEEE J. Solid-State Circuits, vol. 41, no. 3, pp.